Generalized ocean color inversion model for retrieving 
marine inherent optical properties 
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Ocean color measured from satellites provides daily, global estimates of marine inherent optical proper- 
ties (IOPs). Semi-analytical algorithms (SAAs) provide one mechanism for inverting the color of the 
water observed by the satellite into IOPs. While numerous SAAs exist, most are similarly constructed 
and few are appropriately parameterized for all water masses for all seasons. To initiate community-wide 
discussion of these limitations, NASA organized two workshops that deconstructed SAAs to identify simi- 
larities and uniqueness and to progress toward consensus on a unified SAA. This effort resulted in the 
development of the generalized IOP (GIOP) model software that allows for the construction of different 
SAAs at runtime by selection from an assortment of model parameterizations. As such, GIOP permits 
isolation and evaluation of specific modeling assumptions, construction of SAAs, development of region- 
ally tuned SAAs, and execution of ensemble inversion modeling. Working groups associated with the 
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workshops proposed a preliminary default configuration for GIOP (GIOP-DC), with alternative model 
parameterizations and features defined for subsequent evaluation. In this paper, we: (1) describe the 
theoretical basis of GIOP; (2) present GIOP-DC and verify its comparable performance to other popular 
SAAs using both in situ and synthetic data sets; and, (3) quantify the sensitivities of their output to their 
parameterization. We use the latter to develop a hierarchical sensitivity of SAAs to various model pa- 
rameterizations, to identify components of SAAs that merit focus in future research, and to provide 
material for discussion on algorithm uncertainties and future emsemble applications. © 2013 Optical 
Society of America 
OCIS codes: 010.4450, 280.4991. 


1. Introduction 

Satellite ocean color instruments, such as the NASA 
Moderate Resolution Imaging Spectroradiometer on- 
board Aqua (MODISA) and ESA Medium Resolution 
Imaging Spectrometer (MERIS), provide daily global 
estimates of marine inherent optical properties 
(IOPs). Remotely sensed IOPs, namely the spectral 
absorption and backscattering coefficients of the 
surface water column and its particulate and dis- 
solved constituents, describe the contents of the 
upper ocean, information critical to furthering scien- 
tific understanding of biogeochemical processes, such 
as carbon exchanges, phytoplankton biodiversity 
shifts, and responses to climatic disturbances. As 
such, the international community has invested 
significant effort in ensuring and improving the 
quality of remotely sensed IOPs. Recognizing this, 
the International Ocean Colour Coordinating Group 
(IOCCG) established a working group to compile and 
compare popular methods for estimating IOPs from 
ocean color [1]. NASA recently extended this effort 
through a series of internationally attended work- 
shops aimed at: (a) deconstructing each method to 
identify similarities and uniqueness; (b) identifying 
strategies to provide uncertainties for each method; 
and (c) achieving community-wide consensus on a 
unified method for generating global satellite IOP 
products [2]. 

Semi-analytical algorithms (SAAs) provide one 
mechanism for inverting ocean color into IOPs 
through a combination of empiricism and radiative 
transfer theory. Many published SAAs attempt to 
retrieve spectral IOPs from spectral remote-sensing 
reflectances [R rs (l); sr _1 ] and to further separate the 
total absorption and backscattering coefficients into 
subcomponents [e.g., the absorption of algal, nonal- 
gal, and colored dissolved organic matter (CDOM)] 
[3-11] . These SAAs generally assume spectral shape 
functions (or eigenvectors) of the constituent absorp- 
tion and scattering components and retrieve the 
magnitudes (or eigenvalues) of each constituent 
required to match the spectral distribution of R rs (l). 
Thus, SAAs often differ only in the assumptions 
employed to define the eigenvectors and in the 
mathematical methods applied to calculate the ei- 
genvalues. In lieu of this, and in support of the afore- 
mentioned workshops, the NASA Ocean 
Biology Processing Group (OBPG) [ 12 ] developed 


the generalized IOP (GIOP) framework to facilitate 
controlled evaluation of the varied SAA approaches 
within a satellite data processing environment [13], 
Briefly, GIOP allows construction of different IOP 
models at runtime by selection from a wide assort- 
ment of published absorption and backscattering 
eigenvectors [4,6-8,14-17]. As such, GIOP permits 
isolation and evaluation of specific modeling assump- 
tions, construction of new SAAs, development of 
regionally tuned SAAs (e.g., [IB]), and ensemble in- 
version modeling. The OBPG implemented GIOP is 
standard NASA ocean color processing code [ 19 ] and 
currently distributes GIOP to the research commu- 
nity via the Sea-viewing Wide Field-of-View Sensor 
(SeaWiFS) Data Analysis System (SeaDAS) [20] . The 
working group associated with the NASA GIOP 
workshops (October 2008 and September 2010) [2] 
proposed the preliminary configuration of GIOP, 
with alternative model parameterizations and fea- 
tures defined for subsequent evaluation. In this 
paper, we: (a) describe the theoretical basis of GIOP; 
(b) present its preliminary configuration; (c) evaluate 
and validate this configuration; and (d) quantify the 
sensitivities of output eigenvalues to their assumed 
eigenvectors. We provide a brief summary of the 
model components within GIOP as compared to 
several common SAAs in Appendix A. 

2. Methods 

A. Model Development 

Ocean color satellite instruments provide estimates 
of R rs (A) after atmospheric correction. The method of 
Lee et al. [7] provides a convenient method to convert 
R TS (A) to their subsurface values: 


r TS U, 0 ) 


RrsW 

0.52 + 1.7 R rs (4) ' 


( 1 ) 


Subsurface remote-sensing reflectances relate to 
marine IOPs via 


r rs fy. 0") = Gfylfyfy) + G 2 (A)u(A) 2 , (2) 

where the two G(A) (sr -1 ) vary with illumination 
conditions and geometry, sea surface properties, 
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bidirectional effects, and the shape of the marine eigenvalue and a power function often represents 

volume scattering function. Most often, u(A) is the eigenvector: 

described as 


u{A) = 


a(/ 1) + b b (/ 1) ’ 


(3) 


where b b is the total backscattering coefficient (m -1 ) 
and a is the total absorption coefficient (m -1 ) [21], but 
formulations such as u(A) = b b (A)/a(A) have also 
been adopted [22]. Common methods for estimating 
G(A) include Gordon et al. [21], where Gi and G 2 are 
spectrally fixed to 0.0949 and 0.0794 (see [7,23] for 
alternative coefficients), and the tabulated results 
of Morel et al. [22], where Gj is estimated using solar 
and sensor geometries and an estimate of algal bio- 
mass and G 2 is set to 0. GIOP supports all of these 
options. In practice, most SAAs first solve for u(A), 
then decompose (or, invert) u{A) into its component 
IOPs. 

The absorption coefficient can be expanded as 
the sum of all absorbing components. Further, 
each component can be expressed as the product 
of its concentration-specific absorption spectrum 
(eigenvector; a*) and its concentration or amplitude 
(eigenvalue; A): 


N * 

a(A) = a w (A) + Y A <k a *i>Y 
1=1 

N d N g 

+ ^A di a^.(/l) + Y^A gi a*.(A), (4) 

i = 1 i = 1 


where the subscripts w, </>, d, and g indicate contribu- 
tions by water, phytoplankton, nonalgal particles 
(NAP), and CDOM. Both a d (A) and a g (A) are com- 
monly expressed as 


b* bp (A)=A s », (8) 

where S bp typically varies between -2 and 0 from 
small to large particles. While commonly employed 
in the remote-sensing paradigm, we acknowledge 
the validity of the power function for bl p (A) remains 
debatable [25-27]. Both a w (A ) and b bw {A) are 
known [ 28 , 29 ]. 

Using R rs (A) and eigenvectors as input, eigenval- 
ues for absorption (A) and backscattering (B) can 
be estimated via linear or nonlinear least squares 
inversion of Eqs. (l)-(3). For example, Roesler and 
Perry [3] used the nonlinear optimization scheme 
of Levenberg-Marquardt (LM), while Hoge and Lyon 
[4] showed the problem could be linearized and thus 
directly solved via linear matrix inversion. GIOP 
supports both linear and nonlinear optimization 
and matrix inversion schemes [13]. The optimized 
eigenvalues represent the relative contributions of 
each defined absorbing and scattering constituent. 
In the special case where a*AA) is provided as 
chlorophyll-specific absorption (m 2 mg -1 ), the eigen- 
value A (/ , provides an estimate of the chlorophyll 
concentration, C a (mgm -3 ). Note that this model 
describes each component of absorption and back- 
scattering as a linear sum of subcomponents, 
presumably with unique spectral dependencies [sym- 
bolized by the summation over N in Eqs. (4), (6), and 
(7)]. In this way, the absorption characteristics for 
different phytoplankton populations and the scatter- 
ing characteristics of multiple size distributions of 
suspended particles can be represented, or Eq. (6) 
can be re-expanded to Eq. (4). 


a* dg {A) = ex P (S d j,A), (5) 

where S d and S g typically vary between 0.01 and 
0.02 nm -1 in natural waters [24]. As the spectral 
shapes of NAP and CDOM absorption differ only 
in their exponential slopes, the two components 
are typically combined for satellite applications 
and Eq. (4) becomes 

a(A) = a w (A ) + X^A^.aJ.G) + Y^A dgi a* dg {X). (6) 

i=l i = 1 

which is the default option for GIOP. 

Total backscattering can be expanded to 

N bp 

b b (A) = b bw {A ) + YB bPi b* h JA), (7) 

i = 1 

where the subscripts bw and bp indicate contribu- 
tions by seawater and particles. B bp provides the 


B. Model Configuration 

GIOP provides a satellite data processing framework 
within which an SAA can be constructed at runtime 
by selection from an assortment of published eigen- 
vectors (Table 1). Franz and Werdell [ 13 ] provide a 
detailed description of its use within the SeaDAS 
environment [20]. The general form of the GIOP 
inversion model (Section 2 .A ) is common to a number 
of published approaches (e.g., [3-5,8,9,30]) whose 
differences reside in the choice of eigenvectors 
employed, the number of eigenvalues resolved, the 
optimization method selected, and the number of 
sensor wavelengths considered in the optimization. 
A unique instance of GIOP is therefore defined by 
specifying eigenvectors for each optically significant 
constituent assumed to exist in the water column. 
The working groups associated with the NASA GIOP 
workshops [2] proposed the following preliminary, 
default configuration of GIOP, hereafter referred to 
as GIOP default configuration (GIOP-DC), to support 
NASA’s production and distribution of global IOP 
data products: 
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Table 1. Summary of Eigenvectors Available for Use in GIOP (as of March 2011)* 


Eigenvector 

Description 

Reference 

oJG) 

User-provided aJO 

Maritorena et al. (2002) tabulated aJO 

[8] 


Bricaud et al. (1998) -derived aJO using OC-derived C a 

[14] 


Ciotti and Bricaud (2006 (-derived a JO using user-provided size fraction 

[17] 

a * *dgG) 

Eq. (5) with user-provided S,i g 

Eq. (5) with Lee et al. (2002 (-derived S dg 

[7] 


Eq. (5) with Franz and Werdell (2010)-derived S dg 

[13] 


User-provided a dg O 

Eq. (8) with user-provided S bp 

Eq. (8) with Hoge and Lyon (1996)-derived S bp 

[4] 


Eq. (8) with Lee et al. (2002)-derived S bp 

[7] 


Eq. (8) with Ciotti et al. (1999)-derived S bp 

[15] 

K P w 

Eq. (8) with Morel and Maritorena (2001)-derived S bp 

[22] 

Eq. (8) with Loisel and Stramski (2000)-derived S bp 

[6] 


User-provided 6J p O 

Loisel and Stramski (2000)-derived 6j p O 

[6] 


Lee et al. (2002)-derived bl p (X) 

[7] 


“Boldface indicates the eigenvector used in GIOP-DC. 


• Relate r rs (A, 0 ) to IOPs using Eq. (2) and 


_ b bw (X)+B hp b bp (X) 

bbwW+B bp bl p (A) + a w (A) +A dg a dg (X) +A ( j ) a* lj) (X) 

(9) 

• Spectrally invariant Gi and G 2 from Gordon 
et al. [ 21 ] 

• bl (X) from Eq. (8) and S bp from Lee et al. [7] 

• a* da {X) from Eq. (5) and S d „ = 0.018 nnr 1 [ 24 ] 

• a^iX) from Bricaud et al. [14], estimated using 
OC-derived C a [31] and normalized such that 
oJ( 443) = 0.055 m^rng -1 . 

The latter normalization allows the spectral shape of 
a^(X) to change with an estimate of C a , but constrains 
its magnitude to an average value for oceanic 
water [3,8,14]. 

Our choice of default parameterizations served two 
purposes: to provide some consistency with previ- 
ously developed SAAs and to acknowledge the 
emerging quality of variable, dynamically selected 
eigenvectors. In particular, we adopted Gq and G 2 
from Gordon et al. [21] to be consistent with previ- 
ously published work [3-5, 7-9] and variable S bp 
and a^(X) to better represent heterogeneous natural 
environments than would be possible with fixed val- 
ues. Our choice of a fixed S dg represents a compro- 
mise between multiple published values [8,24], 
although we expect GIOP-DC to ultimately adopt a 
variable S dg as methods for its dynamic calculation 
improve. Note, we did not make these choices based 
on comparisons with in situ or synthetic data sets, 
nor do we suggest that these parameterizations re- 
present all natural oceanic conditions at all times. 
Our choices simply represent a consistent, modern 
SAA configuration from which to evolve as our under- 
standing of marine bio-optics improves. 
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In GIOP-DC, the three unknown eigenvalues 
{B bp ,A dg , and A^) are optimized using LM, and wave- 
lengths between 400 and 700 nm are considered. Our 
convergence criterion tests both absolute and rela- 
tive changes in the optimized eigenvalues between 
iterations of the LM algorithm. Convergence is 
reached when the change in any eigenvalue, X, 
between iteration i and iteration i + 1 satisfies 
\X i+ i -X t \ < 0.0001 + 0.0001|X;|. If the number of 
iterations exceeds 50, the iteration is stopped and 
the result is treated as algorithm failure. Note that 
these criteria do not require the results to be valid or 
even that the model provides a good fit to the data. 
Despite this, we consider the IOP retrievals to be 
valid when 

• -0.05 b bw (X) < b bp (X) < 0.05 m _1 

• -0. 05a w (X) < a dg (X) < 5 m -1 

• -0.05au,(/i) < a^(4) < 5 nT 1 

• A R rs < 33%, 

where the mean relative R rs (4) difference, a measure 
of goodness of fit, is derived as 


_ ioo%^ |R rs (4)-R rs (4)l 

Nx {z i B rs {Xi) 


GO) 


for 400 < A < 600 nm with R rs indicating remote- 
sensing reflectance reconstructed from the modeled 
IOPs. We consider slightly negative IOP values to be 
viable (rather, statistically equivalent to zero), as 
the retrieved eigenvalues include some uncertainty. 
We provide a brief comparison of GIOP-DC ( and GIOP, 
in general) and several other SAAs commonly used in 
satellite ocean color applications in Appendix A. 

C. Uncertainties 

The LM optimization method minimizes a / 2 merit 
function defined as 


2 (Rrsi^i) ~ Rrsik)) 2 M 

x = 2 ^ alti (n ) 

i=l ° wi) 

where a (A;) are the input uncertainties on R r J). l ). LM 
makes use of the Jacobian matrix ( J ), which is 
derived by evaluating the partial derivatives of the 
merit function with respect to the optimized eigen- 
values at each A;. From the Jacobian, the covariance 
matrix (M) is then constructed as M = (J T J)~ l , re- 
sulting in a square matrix with dimensionality equal 
to the number of model eigenvalues. The square-root 
of the diagonal elements of M represent the uncer- 
tainty on the optimized eigenvalues, taking into 
account both the weighting and scale of the input 
uncertainties. If reliable values of <7(7,) are not 
available, they are set to 1.0 and the optimization 
is unweighted. In that case, the uncertainty is com- 
puted using the square-root of the diagonal terms of 
the variance-covariance matrix, a 2 M, where 



The GIOP-DC configuration of GIOP currently uti- 
lizes an unweighted optimization with the variance- 
covariance matrix utilized to estimate uncertainties 
on the model eigenvalues. The uncertainty on the 
derived IOPs is then determined by multiplying 
the uncertainty of the optimized eigenvalue by the 
associated eigenvector. 

D. Data Acquisition 

We acquired coincident observations of in situ and 
synthesized i? rs (l), b bp (X), a dg (X), a^(A), and C a from 
the NASA bio-Optical Marine Algorithm Data 
set (NOMAD) [32] and the IOCCG Ocean Colour 
Algorithms Working Group synthetic data set [1] . We 
also acquired coincident Level-2 satellite-to-m situ 
match-ups for the NASA SeaWiFS and MODISA in- 
struments from the OBPG (http://seabass.gsfc.nasa 
.gov/seabasscgi/validation_search.cgi). Satellite data 
processing and quality assurance for the match- 
ups followed Bailey and Werdell [33]. Specifically: 
(1) temporal coincidence was defined as ±3 h; (2) sat- 
ellite values were the filtered median (via the semi- 
interquartile range) of all unmasked pixels in a 5 x 5 
box centered on the in situ target; and (3) satellite 
values were excluded when the coefficient of varia- 
tion for the given product within this box exceeded 
0.15. SeaWiFS and MODISA data were processed fol- 
lowing their R2010.0 (September 2010) and R2012.0 
(May 2012) reprocessing configurations, respectively 
[ 34 ] . The in situ data used in the match-up analyses 
include a small subset of NOMAD, plus additional 
R rs (X), C a , b bp (X), a dg (A), and a,p(A,) archived in the 
NASA SeaWiFS Bio-optical Archive and Storage 
System (SeaBASS) [35]. Sample sizes for b bp (X) differ 
from a (A), a dg (X), and a,/,(X) in NOMAD and the 
match-up data sets because of instrumental variabil- 
ity in field campaigns included in these compilations. 


E. Validation Analyses 

We adopted a validation approach based on model- 
measurement regression statistics and a spectral 
goodness of fit metric. We compared modeled and 
ground-truth IOPs, with the former calculated using 
in situ, synthesized, and satellite R IS (X) from the 
NOMAD, IOCCG, and satellite-to-m situ match-up 
data sets. We generated Type II linear regression 
statistics and estimates of AIOP for all of the above, 
with AIOP defined as 

aio P = ^^^op M -iop M 
N * f-( IOP(/fi) + IOPOd 

for 400 < A < 600 nm. IOP and IOP indicate a mea- 
sured and modeled absorption or backscattering 
coefficient, respectively. Accordingly, we report AIOP 
elsewhere as A b bp , A a, A a dg , and A a^. We generated 
modeled IOPs using in situ R rs (X) and calculated 
AR rs and AIOP distribution medians and semi- 
interquartile ranges for various populations (here, 
the semi-interquartile range is the range covered by 
values such that 50% occur with equal probability on 
either side of the median). Furthermore, for the 
analysis of these delta statistics, we stratified the 
ground-truth stations into three trophic levels: oligo- 
trophic water has C a <0.1 mg nr 3 ; mesotrophic 
water has 0.1 <C Q <1 mgm 3 ; and eutrophic water 
has C a > 1 mgm" 3 . Table 2 provides calculations for 
other comparative statistics. 

F. Sensitivity Analyses 

We executed 12 additional unique instances of GIOP 
to evaluate the sensitivity of the GIOP-DC configu- 
ration to alternate selections of eigenvectors. Each 
of the 12 instances included a single alternate 
parameterization: (1,2) S dg ± 33% (=0.012 and 
0.024 nm" 1 ), (3) S dg dynamically calculated using 
Lee et al. [7], (4,5) S bp from Lee et al. [7] ±33%; (6,7) 
OC-derived C a ± 33% prior to input into Bricaud 
et al. [14]; (8) a^(A) from Bricaud et al. [ 14 ] with 
C a fixed at 0.18 mgm -3 ; (9) aJ(A) from Ciotti and 
Bricaud [L7] with a size fraction of 0.5; (10) G(X) from 
Morel et al. [22], (11) optimization using linear ma- 
trix inversion, and (12) optimization considering only 
400 < A < 600 nm. We quantified spectral changes in 
modeled IOPs for each alternate parameterization 
(relative to the baseline GIOP-DC configuration) us- 
ing Type II regression statistics, estimates of AIOP, 
and Taylor [ 36 ] and Target [ 37 ] summary diagrams. 
The latter provide convenient means for simultane- 
ously considering magnitudes of deviations, correla- 
tions, and biases between the alternate runs and 
GIOP-DC. To minimize the impact of uncertainties 
inherent to the in situ IOP measurements, we only 
considered the synthetic IOCCG data set in these 
analyses (note, this does not remove uncertainties 
in the truth data set, but rather, limits them to those 
associated with the inherent assumptions used to 
generate this synthetic data set). 
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Table 2. Regression Statistics for GIOP-DC Using the NOMAD and IOCCG Data Sets* 


NOMAD IOCCG 



N 

r 2 

Slope (SE) 

Ratio 

MPD 

N 

r 2 

Slope (SE) 

Ratio 

MPD 

412 

217 

0.49 

0.99 (0.05) 

1.26 

28.2 

437 

0.92 

1.08 (0.01) 

0.98 

10.1 

443 

217 

0.52 

1.00 (0.05) 

1.27 

28.8 

437 

0.93 

1.07 (0.01) 

1.00 

9.3 

490 

217 

0.57 

1.04 (0.05) 

1.29 

30.0 

437 

0.94 

1.07 (0.01) 

0.99 

8.2 

555 

217 

0.60 

1.06 (0.05) 

1.30 

32.8 

437 

0.95 

1.09 (0.01) 

0.91 

11.8 

670 

217 

0.64 

1.09 (0.05) 

1.27 

31.3 

437 

0.96 

1.09 (0.01) 

0.90 

13.0 

412 

649 

0.90 

1.14 (0.01) 

0.89 

20.5 

421 

0.99 

1.03 (0.00) 

1.03 

7.2 

443 

657 

0.90 

1.13 (0.01) 

0.88 

21.8 

427 

0.99 

1.03 (0.01) 

0.99 

6.1 

490 

657 

0.88 

1.13 (0.02) 

0.84 

24.6 

432 

0.98 

1.03 (0.01) 

0.98 

8.7 

555 

654 

0.81 

1.21 (0.02) 

0.77 

37.4 

436 

0.95 

1.11 (0.01) 

0.73 

29.9 

670 

609 

0.80 

1.18 (0.02) 

1.11 

37.4 

424 

0.86 

1.01 (0.02) 

1.50 

58.6 

412 

654 

0.83 

1.12 (0.02) 

0.84 

28.8 

426 

0.98 

1.02 (0.01) 

1.01 

13.1 

443 

662 

0.81 

1.10 (0.02) 

0.78 

35.1 

429 

0.97 

1.04 (0.01) 

0.90 

15.9 

490 

662 

0.76 

1.06 (0.02) 

0.65 

43.2 

435 

0.95 

1.07 (0.01) 

0.80 

25.0 

555 

656 

0.71 

1.03 (0.02) 

0.53 

52.7 

437 

0.91 

1.08 (0.02) 

0.58 

44.1 

670 

621 

0.61 

0.96 (0.03) 

0.32 

69.8 

426 

0.82 

1.05 (0.02) 

0.36 

64.2 

412 

676 

0.76 

1.23 (0.02) 

0.94 

27.5 

419 

0.81 

1.18 (0.03) 

1.16 

32.8 

443 

682 

0.76 

1.21 (0.02) 

0.98 

25.9 

419 

0.79 

1.13 (0.03) 

1.20 

30.3 

490 

682 

0.76 

1.23 (0.02) 

1.08 

27.2 

422 

0.78 

1.13 (0.03) 

1.34 

41.4 

555 

673 

0.78 

1.22 (0.02) 

1.25 

42.5 

422 

0.84 

1.10 (0.02) 

1.36 

52.5 

670 

680 

0.82 

1.12 (0.02) 

1.33 

43.3 

422 

0.81 

1.00 (0.02) 

2.01 

100.8 


a N, r 2 , and Slope (SE) are the sample size, coefficient of determination, and regression slope (standard error of the regression slope), 
respectively. Ratio is the median ratio calculated as median(X i /.X';) and MPD is the median percent difference calculated as 
median(100% * | XJXi - 1|), with X t and A, indicating each modeled and measured IOP, respectively. With the exception of Ratio 
and MPD, data were log-transformed. 


3. Results 

Direct comparisons of modeled and ground-truth 
IOPs provided estimates of the accuracy and preci- 
sion of GIOP-DC. Overall, spectral IOPs from 
GIOP-DC (indicated by a carot hat below) repro- 
duced the in situ and synthetic values moderately 
well over the full dynamic range of the NOMAD 
and IOCCG data sets, with r 2 ranging from 0.49 to 
0.90 and 0.78 to 0.99, respectively (Fig. 1, Table 2). 
With the exception of a^(A), median percent 
differences (MPD; the caption for Table 2 provides 
its calculation) for NOMAD showed greater overall 
variability than for the IOCCG data set, with values 
ranging largely from 20%-40% compared to 
10%-30%. Both data sets showed less variability 
for b bp (A) and a (a) than for the component absorption 
products, a dg (A) and a^A) [38,39]. As demonstrated 
by their median ratios, b bp (A) and a (A) for NOMAD 
showed high and low biases (over and underesti- 
mates), respectively, while both approached unity 
for the IOCCG data set. Relative to the ratios for 
a(A), ratios for a dg (X) and a,/, (A) from both data sets 
showed low and high biases, respectively, although 
the IOCCG results better demonstrated this behav- 
ior. Interestingly, the absolute magnitudes of the 
biases in component absorption products increased 
with increasing wavelength for both data sets, pos- 
sibly due to the increasing contribution of a w (A ) to 
a (A) with increasing wavelength. The magnitudes 
of the regression slopes for b bp (X) and a dg (A) from 


NOMAD also showed strong, albeit opposite, spectral 
dependency. 

Not surprisingly, comparisons of modeled and 
in situ IOPs for SeaWiFS and MODISA resulted in 
similar statistics to those for NOMAD, as the in situ 
IOPs used in the NOMAD and satellite comparisons 
stemmed from similar sources (Fig. 2, Table 3). This 
also implies, however, that the satellite and in situ 
R rs {A) carry similar spectral uncertainties or that 
the inversion process remains somewhat insensi- 
tive to uncertainties in R rs (4). The SeaWiFS and 
MODISA samples sizes and dynamic ranges were 
far lower than those for the NOMAD and IOCCG 
comparisons, yet their MPD were higher, at least 
for the absorption coefficients. Similar to the 
NOMAD and IOCCG results, MPD for a (lg (A) and 
a^(X) exceeded those for b bp (X) and a(A). The median 
ratios for a dg (X) and a,p(A) showed definitive spectral 
dependency, but only partial low and high biases rel- 
ative to the ratios for a (A). Similar to the NOMAD 
and IOCCG results, the magnitudes of the slopes 
for b bp (A) and a dg (A) for SeaWiFS showed spectral 
dependencies. Such behavior was not obvious for 
MODISA, presumably given its small sample size 
for a dg (A). As for the NOMAD and IOCCG results, 
GIOP-DC appeared to best retrieve a(A), echoing 
the findings of an IOCCG working group [1]. 

GIOP-DC ran successfully on 90% of stations in 
NOMAD and the IOCCG data set, independent of tro- 
phic level (Table 4). The 10% failure rate resulted from 
a combination of A R IS > 33% and nonconvergence of 
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Fig. 1. Comparison of GIOP-DC and ground truth (in situ) IOPs 
at 443 nm from NOMAD (black) and the IOCCG data set (red). The 
left column shows scatter plots for regression analyses. The right 
column shows ratios of GIOP-DC to ground truth. See Table 2 for 
accompanying statistics. 


the inversion. When successful, A R rs fell below 2% on 
average for all trophic levels for both data sets 
(Fig. 3). While this indicates the eigenvectors can ac- 
curately reproduce R rs (/), it does not necessarily sug- 
gest that GIOP-DC employs ideal eigenvectors. In 
contrast to A R rs , AlOPs fell between 25%-50% for 
NOMAD (excluding the oligotrophic subset) and 
5%-35% for the IOCCG data set (excluding the eutro- 
phic, Aa v/ „ Table 4). As will be discussed later, some 
variability in the NOMAD results stems from the 
in situ data, most of which fail to accurately achieve 
radiometric closure between R IS (A) and IOPs because 
of G(A) or measurement error [see Eq. (2)]. The 
IOCCG data set, however, maintains radiometric clo- 
sure by design and reported A b bp and Aa distribution 
means and modes below 15% (Fig. 4). As for the 
match-up results, delta values for the component ab- 
sorption products, A a dg and A were degraded rel- 
ative to Aa. The latter is true for the full IOCCG data 
set and for the meso- and eutrophic subsets of 
NOMAD, which dominate its sampling distribution. 
For the IOCCG data set, A a dg remained consistent 
for the three trophic levels (28% ± 3%), while A in- 
creased with increasing trophic level. This perhaps 


resulted from the decreasing relative concentration 
of a^,(A) to total absorption. The oligo-, meso-, and eu- 
trophic subsets of the IOCCG data set maintain 
median ratios for a^( 443) to a^( 443) + a^(443) of 
0.40, 0.31, and 0.23, respectively. 

A hierarchy emerges from the sensitivity analyses 
performed on the IOCCG data set using alternate pa- 
rameterizations of eigenvectors. Under- and overesti- 
mation of S bp produced eigenvalues that differed 
from GIOP-DC by only 2%— 8% (MPD), with slightly 
elevated AlOPs (Table 5). This indicates the re- 
trieved eigenvalues to be fairly insensitive to S bp , 
particularly given that GIOP-DC employs a dynamic 
calculation of this eigenvector [7], Modifying the C a 
used as input into Bricaud et al. [ 14 ] to retrieve a*^{X) 
produced eigenvalues that differed from GIOP-DC by 
only 1%— 7% (MPD), with very comparable AlOPs. 
We suspect this results from stability in a'^(A) from 
the Bricaud model over narrow ranges of C a (±33%). 
Not surprisingly, the use of a fixed a*^{A), from either 
Bricaud et al. [ 14 ] or Ciotti and Bricaud [17], resulted 
in eigenvalues that differed more significantly from 
GIOP-DC, particularly in a^(A). This less dynamic 
approach to assigning eigenvectors cannot as effi- 
ciently represent all water types at all times and, 
as such, these two runs appear as outliers with re- 
gard to standard deviations in the Taylor and Target 
diagrams (Figs. 5-7; blue circle and green square). 
However, while these two fixed a*/, (A) runs returned 
somewhat elevated Ab bp , Aa, and A a dg relative to 
GIOP-DC, they returned improved Aa, /( . The choice 
of S dg appears to be the most critical within the con- 
text of this experiment, at least with regard to sepa- 
rating total absorption into its algal and nonalgal 
components [38] . The three changes to S dg produced 
the three most significant departures from GIOP-DC 
in retrieved eigenvalues. Reducing S dg to 0.012 nm -1 
produced the highest A b bp and Aa. This run appears 
as an outlier with regard to standard deviation and 
correlation in the Taylor and Target diagrams 
(Figs. 5-7; blue cross). In contrast, elevating S dg to 
0.024 nm -1 produced the highest A a dg and Aa,/,. 
Dynamically selecting S dg via Lee et al. [7] produced 
an equivalent AIOP to GIOP-DC and the lowest over- 
all A a dg of all the runs, emphasizing a potential ben- 
efit from dynamically assigning S dg . 

Choices made for the inversion itself also impacted 
the retrieved eigenvalues, a discussion of which in- 
frequently appears in previous literature (Table 5) 
[9,39]. Using a linear matrix inversion method in 
lieu of the nonlinear LM optimization resulted in 
departures from GIOP-DC of only 2%— 7% (MPD), 
with comparable AlOPs. Excluding R rs (670) from the 
inversion (that is, using only R rs from 400 to 600 nm) 
resulted in negligible differences from GIOP-DC and 
similar AIOP. Intuitively, we expected larger depar- 
tures given the significance of red R rs for SAA 
applications in turbid waters [1,40]. This analysis 
employed a synthesized data set, however, which 
does not comprehensively represent highly turbid, 
highly scattering environments. As will be explored 
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Fig. 2. Comparison of GIOP-DC and ground truth (in situ) IOPs 
at 443 nm from SeaWiFS (black) and MODISA (red). See Table 3 
for accompanying statistic. 


in detail later, an alternate choice of G{X) [22] pro- 
duced eigenvalues that deviated more significantly 
from GIOP-DC [6%-14% (MPD)] and resulted in 
elevated AIOP. 

4. Discussion 

We developed GIOP to provide a software environ- 
ment for developing and applying SAAs, not to 
introduce a novel SAA. While GIOP-DC (one configu- 
ration instance of GIOP) provides a viable SAA for 
use in global applications, it remains imperfect, 
and we recommend that this “default configuration” 
evolve over time as the community develops novel 
methods and achieves advanced insights into bio- 
optical modeling and f? rK (/l)-IOP closure. We initiated 
this exercise to illustrate that differences in most 
SAAs are limited to only the selection of eigenvectors 
and the inversion or optimization approach, and that 
a consolidated software framework such as GIOP 
provides a simple mechanism for moving ocean color 
science forward by allowing controlled SAA evalu- 
ation, regional tuning, dynamic eigenvector parame- 
terization based on optical water types (OWTs), and 
ensemble inversion modeling. While GIOP-DC itself 
performs decently for the data sets under inves- 
tigation (Tables 2 and 3) and comparably to other 
established algorithms [ 41 ] (see Appendix A for a 
quantitative discussion), we cannot expect reliability 
at all times in all water types, as is true for all glob- 
ally parameterized SAAs. We focus the following dis- 
cussion on lessons learned from its development and 
evaluation with the goal of recommending (outlining) 
directions for subsequent community research. 


Table 3. Regression Statistics for GIOP-DC Using the SeaWiFS and MODISA Match-Up Data Sets 






SeaWiFS 





MODISA 



N 

A 

Slope (SE) 

Ratio 

MPD 

N 

r 2 

Slope (SE) 

Ratio 

MPD 


412 

123 

0.30 

0.77 

(0.07) 

0.93 

25.2 

56 

0.69 

0.92 

(0.07) 

1.00 

13.2 


443 

123 

0.31 

0.79 

(0.07) 

0.92 

25.0 

56 

0.69 

0.95 

(0.08) 

1.02 

17.2 

^ bp 

490 

123 

0.32 

0.82 

(0.07) 

0.90 

24.2 

56 

0.69 

1.00 

(0.08) 

0.99 

18.2 

555“ 

123 

0.32 

0.84 

(0.07) 

0.87 

25.2 

56 

0.68 

1.05 

(0.09) 

0.99 

18.8 


670 6 

123 

0.31 

0.87 

(0.07) 

0.83 

28.5 

56 

0.63 

1.06 

(0.09) 

1.04 

23.0 


412 

192 

0.74 

1.12 

(0.04) 

0.87 

30.9 

21 

0.45 

0.76 

(0.14) 

0.89 

36.9 


443 

192 

0.81 

1.07 

(0.03) 

0.81 

25.5 

21 

0.73 

0.77 

(0.10) 

0.88 

16.9 

a 

490 

192 

0.80 

1.01 

(0.03) 

0.76 

29.3 

21 

0.84 

0.79 

(0.07) 

0.79 

21.1 


555“ 

192 

0.67 

1.03 

(0.05) 

0.68 

42.3 

21 

0.86 

0.74 

(0.07) 

0.75 

28.9 


670 6 

180 

0.69 

1.19 

(0.05) 

0.87 

45.0 

17 

0.47 

0.91 

(0.19) 

1.88 

87.8 


412 

192 

0.51 

1.12 

(0.06) 

0.86 

45.7 

20 

0.07 

0.96 

(0.27) 

0.88 

40.2 


443 

192 

0.51 

1.08 

(0.06) 

0.78 

49.8 

20 

0.09 

0.96 

(0.27) 

0.81 

34.8 

a dg 

490 

192 

0.48 

1.01 

(0.06) 

0.64 

54.2 

20 

0.11 

0.96 

(0.26) 

0.68 

41.8 

555“ 

191 

0.42 

0.93 

(0.06) 

0.50 

62.5 

20 

0.12 

0.96 

(0.26) 

0.46 

55.3 


670 6 

183 

0.44 

0.82 

(0.05) 

0.34 

72.0 

20 

0.13 

0.98 

(0.26) 

0.32 

67.7 


412 

195 

0.72 

1.17 

(0.05) 

0.73 

35.5 

25 

0.85 

1.14 

(0.09) 

0.91 

22.5 


443 

197 

0.68 

1.14 

(0.05) 

0.80 

31.5 

25 

0.82 

1.20 

(0.11) 

0.90 

31.0 

a $ 

490 

197 

0.68 

1.12 

(0.05) 

0.86 

30.1 

25 

0.81 

1.13 

(0.11) 

0.93 

33.2 


555“ 

186 

0.71 

1.17 

(0.05) 

0.95 

44.8 

25 

0.82 

1.14 

(0.10) 

0.97 

36.6 


670 6 

195 

0.73 

1.05 

(0.04) 

1.11 

48.6 

24 

0.81 

1.24 

(0.12) 

1.35 

49.3 


“indicates that the wavelength is 547 nm for MODISA. 
‘indicates that the wavelength is 667 nm for MODISA. 
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Table 4. Delta Statistics for Various Trophic Levels 9 


All 



NOMAD 



IOCCG 


N% 

Aftotai Med 

SIQR 

N % 

total Med 

SIQR 


AR rs 

90 

964 

1.68 

1.05 

90 

500 

1.04 

0.49 

Abbp 

97 

223 

26.94 

15.64 

87 

500 

8.52 

6.80 

Aa 

88 

735 

27.75 

17.28 

87 

500 

8.56 

5.90 

Aa dg 

90 

735 

51.02 

28.38 

87 

500 

27.25 

20.19 

A 

87 

780 

29.32 

19.17 

84 

500 

35.83 

23.25 

Oligotrophic 




NOMAD 




IOCCG 



N% 

■^total 

Med 

SIQR 

N% 

total 

Med 

SIQR 

A R rs 

98 

51 

1.14 

1.38 

100 

100 

0.59 

0.32 

A bbp 

100 

1 

93.27 

NA 

100 

100 

5.61 

3.00 

Aa 

94 

37 

53.56 

28.94 

100 

100 

6.26 

2.60 

A ajg 

97 

37 

99.83 

36.75 

100 

100 

28.38 

16.06 

Aa^ 

93 

43 

20.67 

11.75 

100 

100 

22.81 

11.42 

Mesotrophic 




NOMAD 




IOCCG 



N% 

■^total 

Med 

SIQR 

N% 

total 

Med 

SIQR 

A R IS 

90 

477 

1.56 

0.76 

100 

125 

1.22 

0.37 

&b bp 

97 

127 

33.65 

15.79 

100 

125 

6.51 

3.84 

Aa 

93 

326 

25.71 

16.23 

100 

125 

6.59 

3.22 

kCLdg 

96 

326 

46.53 

28.81 

100 

125 

30.72 

22.95 

Aa^, 

94 

353 

23.93 

13.22 

100 

125 

31.44 

19.20 

Eutrophic 




NOMAD 




IOCCG 



N% 

N total 

Med 

SIQR 

N % 

N total 

Med 

SIQR 

AR IS 

88 

436 

1.91 

1.51 

82 

275 

1.29 

0.70 

Afe 6p 

96 

95 

19.90 

12.06 

77 

275 

13.31 

11.04 

Aa 

82 

372 

28.25 

17.31 

76 

275 

14.56 

8.77 

A a dg 

84 

372 

48.47 

25.94 

77 

275 

25.00 

21.75 

A a# 

80 

384 

40.27 

30.36 

71 

275 

54.49 

34.19 


a N% provides the percentage of valid retrievals returned by GIOP-DC calculated as 100% * N VRiii /N tota i, where Ai valid is the 
number of valid retrievals (not shown). A1 totid , Med and SIQR provide the distribution sample size, median and semi- 
interquartile range, respectively. 


GIOP provides a resource for better understanding 
the sensitivity of an SAA to its assigned eigenvectors. 
Our sensitivity analyses permitted generation of a 
preliminary hierarchy of eigenvector significance, 
which can be used to focus future research on the pa- 
rameterizations of most importance. This approach 
cannot effectively identify an optimal set of eigenvec- 
tors and methods, as we sequentially varied just a 
single parameter and combinations of certain eigen- 
vectors mutually compensate for their respective 
errors. Spectral shapes for b bp (A), a dg (A), and a^X) de- 
crease from 440 to 660 nm, for example, and there- 
fore transfer variances when their shapes change 
within realistic bounds. We also expect a dg {A) and 
a ,/,().) to covary under certain environmental condi- 
tions, such as the open ocean where the detrital pool 
includes derivative products of phytoplankton, such 
as cells walls and cytoplasm. That said, our approach 
identifies the eigenvectors and methods with the 


largest impact on eigenvalue retrievals, and thus, 
reveals those requiring additional attention by the 
community. If we assume that the goal of any SAA 
is the best simultaneous retrieval of b bp (A), a (4), 
a dg (A), and a^A) (e.g., minimizing AIOP), the choice 
of S bp appears to be the least critical (Table 5). In con- 
trast, given the robust ability of SAAs to retrieve 
b bp (A) and a (A), and their reduced capacity to decon- 
volve total absorption into a dg (A ) and a^A), the choice 
of S dg and a^(A) appears to be highly significant. 
Using AIOP as the performance metric, the use of 
dynamically evolving S dg and a*p(A) produced supe- 
rior results to the use of static eigenvectors (Table 5). 
Despite this, spectral dependencies in the retrieved 
eigenvalues revealed in the match-up results 
(Tables 2 and 3) indicates compensation for imperfect 
eigenvectors. Elevated A a dg and A a,/, relative to Aa 
highlights the challenge of using SAA approaches 
in isolation to provide information on phytoplankton 
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Fig. 3. Frequency distributions of Af? rs from NOMAD (black) and 
the IOCCG data set (red) for the all available data (“All”) and sub- 
setted into three trophic levels. The main text provides definitions 
for the oligo-, meso-, and eutrophic subsets. See Table 4 for accom- 
panying statistics. 


community structure [for example, when multiple 
a*AX) are employed to retrieve eigenvalues for multi- 
ple phytoplankton groups]. Each successive de- 
composition of an IOP into component IOPs adds 
variability and ambiguity. 

The selection of G(X), and i? rs (4)-IOP closure 
in general, merits additional consideration by the 
research community. Ambiguity in the retrieval of 
IOPs from -R rs (A) remains a known issue (that is be- 
yond the scope of this work) and combinations of 
successfully retrieved eigenvalues may not make 
geophysical sense, despite numerical closure in the 
inversion [42,43]. That aside, the choice of G(A) from 
Gordon et al. [21] versus Morel et al. [22] resulted in 
eigenvalues that differed by 6%-14% (Table 5). An 
unexpected relationship between G(A), a^A), and 
C a emerged when evaluating these results (Fig. 8). 
The use of the Gordon quadratic expression [Gi(X) = 
0.0949 and G 2 (A) = 0.0794] resulted in a,/, (443) with 
little bias across the full dynamic range of NOMAD 
stations (see the red best-fit line with a slope near 
unity in panel B), but C a with a clear trophic bias 
(see the red best-fit line with a slope less than unity 
in panel A). In contrast, the look-up-table (LUT) ap- 
proach of Morel [Gi(4) from the LUT and G 2 (A) = 0] 


produced C a with little bias (panel C), but a^( 443) 
with a definitive slope greater than unity (panel D). 
Such variability in derived optical and biogeochem- 
ical products calls into question the ability of existing 
approaches to universally estimate G(A) and relate 
to C a within the SAA paradigm [39,44]. With 
regard to G(A), the authors acknowledge the limita- 
tions in their approaches. The quadratic coefficients 
from Gordon are valid for open ocean conditions and 
solar zenith angles >20°. The LUTs from Morel are 
valid for conditions where only phytoplankton and 
their derivative products shape the marine light 
field. Furthermore, navigating these LUTs requires 
an estimate of C a , a remotely sensed data product 
known to be valid in the open ocean, but less so 
in coastal or optically complex conditions. Several 
alternative methods for estimating G(X) in optically 
complex environments now exist [7,23]. While not 
considered in this study, GIOP provides a framework 
for their systematic comparision in subsequent stud- 
ies. With regard to estimating phytoplankton bio- 
mass, a paucity of comprehensive, global data sets 
with which to obtain robust relationships confounds 
our ability to universally toggle between a^(X) and C a 
in this paradigm. As C a and a^( 443) remain propor- 
tionally constant in GIOP-DC [a^(443) = 0.055 CJ, 
we expect much of the variability shown in Fig. 8 
to stem from variability in the in situ measurements. 
However, repeating this analysis without normal- 
izing ( 443 ) to 0.055 m 2 mg' 1 further amplified 

the disconnect between G(A), a, /t (A), and C „ , par- 
ticularly for C a > 1 mg m 3 (results not shown). With- 
out evolving to more sophisticated approaches 
(explored below), an end-user may have to decide 
between tuning an SAA to optics [a^(4); our choice] 
or biogeochemistry ( C a ) (Fig. 8), both of which have 
merit depending on the science question(s) under 
consideration. 

Clearly, the use of fixed eigenvectors within an 
SAA (or dynamic eigenvectors developed using lim- 
ited data sets) cannot capture their natural variation 
across different optical and biogeochemical condi- 
tions. Optical properties of the oceans vary over or- 
ders of magnitude globally and remain neither 
spatially uniform nor constant in time [45-49]. Novel 
methods now exist to constrain regional variability 
in optical parameters, several of which port easily 
into the GIOP framework. OWT classification ap- 
proaches provide one avenue for capturing spatial 
and temporal variations in optical parameters from 
measured R IS (X) [45,49]. The approach of Moore et al. 
[45], for example, assigns individual satellite pixels 
to specific OWTs and subsequently selects OWT- 
specific eigenvectors or algorithms. GIOP provides 
two paths to estimate the final eigenvalues for a 
given pixel: (1) the selected parameterizations are 
blended using OWT-weighted fuzzy membership 
functions and used to construct a single executable 
SAA instance; or (2) the selected parameterizations 
are used to construct multiple executable SAA 
instances and the resulting eigenvalues are blended 
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Fig. 4. Frequency distributions of AIOP from NOMAD (black) 
and the IOCCG data set (red) for all available data. See Table 4 
for accompanying statistics. 


using OWT-weighted fuzzy membership functions. 
Ensemble (or, bootstrapping) approaches provide 
another vehicle to allow for spatial and temporal 
variations in optical parameters [9,39]. The approach 


of Wang et al. [9], for example, successively iterates 
on a single satellite pixel using ranges of eigenvec- 
tors (e.g., S dg from 0.01 to 0.02 nm _1 , and S bp from 
-2 to 0 at assigned intervals) and outputs the median 
eigenvalues (and their standard deviations) for all 
valid retrievals. GIOP provides a mechanism to 
execute the successive iterations. Both series of 
methods remain unconstrained by geographic boun- 
daries and can be easily updated to accommodate 
additional OWT-specific parameterizations as addi- 
tional in situ data become available. 

We must consider the data sets employed in 
this investigation to fully interpret our results. 
The elevated (>1) and reduced (<1) ratios of b bp (X) 
and a (A) suggest incomplete f? rs (A)-IOP closure in 
NOMAD (Table 2). The R rs (A), b bp (X), and a(X) were 
collected using different instruments and methods, 
and uncertainties associated with varied measure- 
ments and methodologies are neither negligible 
nor spectrally independent [32]. The divergent ratios 
of a dg (X) and a^iX) with increasing wavelength likely 
result from low R r .(X) and absorption signals at long 
wavelengths and imperfect parameterization of S dg 
and S bp , both of which control the rate of decline 
of their eigenvectors. None of the data sets we consid- 
ered perfectly encapsulate all bio-optical conditions 
at all times, nor do they represent the relative global 
distribution of such conditions. NOMAD remains 
unevenly distributed among trophic levels, showing 
dominance in meso- and eutrophic waters, with spa- 
tial and temporal biases [32]. The SeaWiFS and 
MODISA match-ups results are equally biased. 
The IOCCG dataset achieves optical closure as it 
was synthesized using a forward model with simple 
optical relationships (e.g., Raman and fluorescence 
effects were not included), but suffers similarly in 
how it numerically represents the relative spatial 
distribution of coincident bio-optical properties. 
Likewise, no single validation method or statistic 
adequately captures the full performance of an 
SAA. In practice, the evaluation of any algorithm 


Table 5. Delta Statistics for the Sensitivity Analyses' 


Run 

N 



MPD 




Median 



&bp 

a 

a dg 


A R IS 


Aa 



GIOP-DC 

437 

NA 

NA 

NA 

NA 

1.04 

8.52 

8.56 

27.25 

35.83 

S bp - 33% 

440 

5.19 

5.17 

7.58 

2.98 

0.99 

11.23 

11.70 

32.14 

34.69 

S bp + 33% 

436 

5.65 

5.70 

8.82 

2.90 

1.14 

11.40 

10.70 

23.51 

39.12 

S dg - 33% 

448 

18.96 

33.44 

101.73 

46.59 

1.61 

16.27 

19.08 

32.94 

31.95 

S dg + 33% 

399 

3.77 

8.41 

40.10 

32.92 

1.23 

9.44 

8.95 

79.90 

59.32 

S dg from [7] 

439 

3.20 

5.33 

20.40 

14.58 

1.10 

8.65 

9.80 

22.25 

34.42 

C a - 33% in [14] 

419 

2.02 

2.92 

1.48 

7.25 

1.19 

8.79 

8.83 

28.62 

31.10 

C 0 + 33% in [14] 

437 

1.56 

2.28 

1.14 

5.90 

1.10 

8.12 

9.17 

26.79 

40.09 

Fixed C a in [14) 

369 

4.57 

7.89 

2.60 

21.68 

1.46 

11.30 

11.53 

30.97 

26.70 

a £ from [17] 

357 

8.33 

12.72 

7.04 

22.23 

1.20 

14.26 

16.75 

38.30 

23.13 

G from [22] 

422 

9.99 

6.15 

7.49 

14.12 

1.16 

11.50 

13.64 

37.49 

36.24 

Matrix inversion 

475 

4.60 

3.68 

2.24 

7.41 

1.73 

9.15 

9.43 

24.79 

36.82 

400 < X < 600 nm 

424 

0.23 

0.21 

0.08 

0.38 

0.92 

8.76 

8.78 

31.94 

36.55 


°N is the sample size. MPD is the average spectral median percent difference between GIOP-DC and each alternate run, as calculated 
in Tables 2 and 3. Medians of the AIOP frequency distributions are also presented, as presented in Table 4. 
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normalized uRMSD 



normalized uRMSD 



normalized uRMSD 


c 




normalized uRMSD 


Fig. 5. Taylor and Target diagrams for IOPs at 412 nm from the IOCCG data set for the 12 alternate parameterizations of GIOP com- 
pared to GIOP-DC. uRMSD is the unbiased root mean square difference. Symbols indicate the following: blue cross = Sgg - 33% 
(= 0.012 nm -1 ); red cross = S dg + 33% (= 0.024 nm -1 ); green circle = S dg dynamically calculated using Lee et al. [7]; blue square = 
S bp from Lee et al. [7] -33%; black square = S bp from Lee et al. [7] +33%; red circle = OC-derived C a - 33% prior to input into Bricaud 
et al. [14]; black circle = OC-derived C„ + 33% prior to input into Bricaud et al. [14]; green square = aJ(A) from Bricaud et al. [ 14 ] with C a 
fixed at 0.18 mgm -3 ; blue circle = a^(X) from Ciotti and Bricaud [17] with a size fraction of 0.5; black cross = G(A) from Morel et al. [22]; 
orange cross = optimization using linear matrix inversion; and green cross = optimization considering only 400 < X < 600 nm 


must consider the science question(s) to be ad- 
dressed, which subsequently leads to the definition 
of validation metrics. Inevitably, trade-offs emerge, 
such as: (1) the value in retrieving IOPs highly ac- 
curately at a single wavelength and poorly at other 
wavelengths versus moderately accurately at all 
wavelengths; (2) the value of retrieving IOPs with 
high accuracy and poor spatial and temporal cover- 
age versus moderately accurately with high spatial 
and temporal coverage; and (3) the value of retriev- 
ing some products with high accuracy [say, b bp (X)] 


and others with less accuracy [say, a d „(X)] versus 
all products [say, both b bp {X) and a dg {X)\ with mod- 
erate accuracy. Choices made in performance met- 
rics ultimately affect the use and interpretation 
of cumulative statistics. Interpreting A R rs , for 
example, provides little value for SAAs designed 
to accurately retrieve IOPs at single wavelengths 
at the expense of quality at other wavelengths 
(Fig. 9). 

Our choices in cost function for the inversion and 
goodness-of-fit metrics for the validation effort affect 
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normalized uRMSD 



normalized uRMSD 



normalized uRMSD 



normalized uRMSD 


Fig. 6. As in Fig. 5, but for IOPs at 443 nm. 


full interpretation of our results. The / 2 cost function 
[Eq. (11)] uses absolute values and considers wave- 
lengths from 400 to 700 nm. The AR rs [Eq. (10)] and 
AIOP [Eq. (13)] metrics report relative values (%) 
and consider only wavelengths from 400 to 600 nm. 
We knowingly adopted this inconsistency for this 
work. In red wavelengths (>600 nm), open ocean 
.R rs (/l) and IOPs can be sufficiently low, such that rel- 
ative (fractional) differences are large, but absolute 
differences are small. Conversely, in blue wave- 
lengths, relative differences are small, while absolute 
differences are large. We recommend that the com- 
bined consideration of absolute and relative spectral 
uncertainties be adopted in future studies. Criteria 
for the cost function and AR rs , for example, could be 
based on maxima of assigned absolute values and 
relative fractions of R rs (/l) for ah wavelengths. Doing 


so would not only ensure consistency between the cost 
function and the validation goodness-of-fit metrics, 
but also eliminate the need to constrain the latter 
to 400-600 nm. 

The choice of input uncertainties for i? rs (A) affects 
both SAA performance and the derived uncertainties 
on IOP retrievals. Reliable uncertainties for 
satellite-derived R r JA) remain difficult to accurately 
determine. The use of signal-to-noise ratio (SNR) 
estimates for the satellite instrument and of statis- 
tical measures based on agreement between satellite 
retrievals and in situ measurements have both been 
explored [50-52] . However, these approaches are 
restricted to the assignment of a single, global uncer- 
tainty estimate to each wavelength (even the latter 
approach, due to the limited availability and geo- 
graphic distribution of such match-ups). In practice, 
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normalized uRMSD 
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Fig. 7. As in Fig. 5, but for IOPs at 555 nm. 


errors in the atmospheric correction process and 
changes in the instrument radiometric performance 
over time complicate uncertainties in satellite R rs (l). 
Remote measurements collected at large viewing 
angles (large atmospheric path lengths) or through 
elevated aerosol loads, where the water-leaving sig- 
nal is a much smaller portion of the total observed 
radiance at the sensor, have higher R rs (A) uncertain- 
ties than those collected through shorter atmospheric 
paths or clear atmospheres dominated by simple 
Rayleigh scattering. Turbid or highly reflective 
waters require additional corrections to separate the 
atmospheric signal from the water signal, which adds 
additional assumptions and inherent uncertainties 
[53], Similarly, R rs (A) retrievals obtained in the vicin- 
ity of land, clouds, or sun glint may be contaminated 
by stray light or atmospheric adjacency effects. 


Finally, the radiometric sensitivity of a satellite 
instrument often degrades over time as the satellite 
ages, leading to decreased SNRs and increased 
instrumental uncertainties [54]. In practice, uncer- 
tainties for R rs (A) vary spatially and temporally, and 
this variability includes changes in both their magni- 
tude and spectral shape. 

The primary impact of uncertainties for R rs (A) on 
the IOP model optimization process, as implemented 
for GIOP, is to change the spectral weighting of the^ 2 
minimization [Eq. (11)]. For example, if the assigned 
variation does not represent the true relative uncer- 
tainty at each wavelength, the minimization process 
will be skewed to improperly favor specific wave- 
lengths and ignore others, and the resulting model 
will not necessarily reproduce the spectral depend- 
ence of f? rs (l) to within their assigned uncertainties. 
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Gordon et al. (1988) Morel et al. (2002) 




Fig. 8. Comparison of GIOP-DC and in situ measurements of C a (panels A and C) and a^(443) (panels B and D) from NOMAD using G(X) 
parameterized by Gordon et al. [21] (panels A and B) and Morel et al. [22] (panels C and D). The black line indicates a 1 : 1 relationship. The 
red line indicates the best fit. The slopes of the best fit lines are 0.84, 1.06, 0.91, and 1.16 for panels A-D, respectively. 


Spectrally constant error in the magnitudes of uncer- 
tainties for R TS (A) will have no impact on the fitting 
results. The derived IOP uncertainties, however, 
are directly proportional to the magnitude of the 
assigned uncertainties for R TS (A). 

Currently, GIOP-DC does not assign uncertainties 
to R rs (A) and uses the variance-covariance matrix to 
derive the IOP uncertainties. This solution effec- 
tively assumes a spectrally flat distribution for i? rs (/l) 
uncertainties. In practice, such an unweighted ap- 
proach yields very similar IOP retrieval results as 
would using global mean uncertainties for i? rs U) 
from match-up analyses, as the latter tends toward 
relatively flat spectral dependence in the blue-green 
regime [51,52]. The use of the variance-covariance 
matrix likely understates the IOP retrieval uncer- 
tainties, since it effectively uses spectral agreement 
between the model and the measurements to esti- 
mate inherent variability in the -R rs (l) measure- 
ments and, thus, ignores spectrally independent 
error in R rs (A). The community ultimately needs a 
reliable method for assigning R r JA) uncertainties 
to each remote sensing observation based on a de- 
tailed error budget of the instrument calibration 
and atmospheric correction process. Characterizing 
the -RrsOf) match-up statistics based on viewing 
geometry, aerosol optical thickness, turbidity, or 
other relevant variables, within the limitations of 
the available satellite-to-m situ match-up database, 
provides an intermediate option [55,56] . Until we ac- 
quire improved knowledge of R IS (A ) uncertainties, 
the assumption of global uncertainties or the use 
of an unweighted optimization provide viable 
options. 


5. Conclusions and Future Directions 

GIOP provides a consolidated software environment 
for developing, applying, and evaluating ocean color 
SAAs to retrieve marine IOPs. While our primary 
contribution remains the GIOP framework itself, 
its community-recommended default configuration, 



0 0.02 0.04 0.06 0.08 0.1 

Fig. 9. MODISA A R rs for GIOP-DC (panel A) and GSM (run using 
GIOP; panel B). GIOP was applied to the monthly MODISA level-3 
bin file for March 2010. Units are nondimensional (0.1 = 10%). 
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GIOP-DC, produces global IOPs of comparable qual- 
ity to other common algorithms [41], As GIOP can 
easily accommodate new eigenvectors and advanced 
approaches as the research community evolves, we 
anticipate and recommend that GIOP-DC be up- 
dated routinely. Several features recently included, 
or in the queue for inclusion, into GIOP include: 

(1) temperature and salinity dependent a w (A) and 
b bw (X) [29]; (2) alternate mathematical inversion 
approaches; (3) alternate eigenvector parameteriza- 
tions; (4) alternate R rs (A)-IOP relationships [e.g., 
G(A) from Lee [23]]; (5) consideration of Raman in- 
elastic scattering; (6) ensemble solution methods, 
such as Wang et al. [9] and Brando et al. [39]; and 
(7) dynamic configuration based on detected OWTs, 
such as Vantrepotte et al. [ 49 ] and Moore et al. [ 45 ] 
(provisional parameters for which were recently de- 
rived). We hope our sensitivity analyses and sub- 
sequent discussion will provide the community 
guidance on future directions for in situ data collec- 
tion and algorithm refinement to support advancing 
SAAs (through GIOP) and their application. We also 
expect the GIOP framework to facilitate analyses as- 
sociated with new mission planning. Its inherent 
ability to operate on any array of wavelengths, for ex- 
ample, provides a resource for identifying new chan- 
nels to be added to forthcoming satellite instruments 
(e.g., ultraviolet bands). 

Appendix A 

In practice, most common, published SAAs fall into 
three broad classes, hereafter referred to as spectral 
optimization, spectral deconvolution, and bulk inver- 
sion. In this Appendix, we briefly introduce each 
class with attention to the interoperability of the 
GIOP framework and several widely use SAAs from 
each class. Note, this classification includes so-called 
inversion algorithms [those that derive IOPs from 
R r AA) via inverse solutions to Eqs. (2) and (3)] and 
does not explicitly consider empirical (statistical) 
or neural-network approaches. 

SAAs in the spectral optimization class operate in 
the manner described for GIOP in Section 2B. That 
is, eigenvectors are predefined [e.g., for b t ^ “*(*>* 
and a'p(A)] and simultaneous solutions for the 
eigenvalues (e.g., for B bp , A dg , and A are achieved 
via linear (matrix) or nonlinear (least squares) opti- 
mization of Eq. (9). The system is overdetermined if 
N a exceeds the number of unknowns. Examples in- 
clude the SAAs described in Roesler and Perry [3], 
Hoge and Lyon [4] , Garver and Siegel [5] , Maritorena 
et al. [8], Wan get al. [9], and Devred et al. [10]. These 
SAAs predominantly differ in their choice of eigen- 
vectors and inversion method and, in principle, GIOP 
can be configured to mimic each of them. For exam- 
ple, the Garver-Siegel-Maritorena (GSM) algorithm 
can be executed within the GIOP framework by 
assigning user-defined S bp , S dg , and a^(A) from 
Maritorena et al. [8], G(X) from Gordon et al. [21], 
and LM optimization. IOPs derived using GSM 
and GIOP with a GSM-like configuration compared 


extremely well for the NOMAD, IOCCG, and match- 
up data sets (results not shown). Validation results 
for GIOP-DC and GSM also compared favorably with 
MPD for GSM-derived 6^(443), a(443), a^(443), 
and a^(443) at 22%, 27%, 29%, and 40% for NOMAD 
and 22%, 26%, 20%, and 52% for the IOCCG data set 
(Figs. 10-12; see Table 2 for equivalent GIOP-DC sta- 
tistics). The nuances in quality assurance metrics, 
success and failure conditions, and fail-safe behavior 
that accompany each SAA listed above, however, 
may not be currently available within the GIOP 
framework. 

SAAs in the spectral deconvolution class simi- 
larly assign eigenvectors, but operate in a step-wise 
fashion to determine the spectral backscattering and 
absorption coefficients, rather than optimizing si- 
multaneous solutions of the eigenvalues. Examples 
include the SAAs described in Lee et al. [7] [the 
quasi-analytic algorithm (QAA)], Smyth et al. [11], 
and Pinkerton et al. [57]. Broadly speaking, SAAs 
in this class operate via the following steps: 

(1) Assign S bp , S dg , and c,/, [a partial eigenvector 
for aJ(A) defined as a1(412)/aj(443)]. 

(2) Estimate b bp (A 0 ), where A 0 is typically a green 
wavelength. 

(3) Calculate b bp (A ) as the product of b bp (A 0 ) 
and Eq. (8) (requires S bp ). 



0.001 0.002 0.005 0.01 0.02 0.05 0.1 

Fig. 10. MODISA 6^(443) for GIOP-DC (panel A), GSM (run us- 
ing GIOP; panel B), and QAA (panel C). The algorithms were ap- 
plied to the monthly MODISA level-3 bin file for March 2010. 
Units are m -1 . 
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(4) Calculate a{X) using b bp (A) and Eq. (2) 
[requires G(A)]. 

(5) Estimate a dg (A 0 ), where A 0 is typically a blue 
wavelength (requires S dg and 

(6) Calculate a dg (A) as the product of a,i g (A d ) and 
Eq. (5) (requires S dg ). 

(7) Calculate a^(l) as a {A) - a w (A) - a dg (A). 

Note that b bp (A 0 ) and a dg (A 0 ) are equivalent to B hp 
and A dg , respectively, in Eq. (9). These SAAs differ 
in their assignment of S bp , S dg , and e^, and in their 
treatment of steps 2 and 4. A complete review of the 
differences exceeds the scope of this paper, however, 
several merit mentioning. Both Smyth et al. [ 11 ] and 
Pinkerton et al. [ 57 ] assign constant values for S bp , 
S dg , and c y/ „ whereas Lee et al. [7] dynamically esti- 
mates all three using empirical relationships based 
on R rs (A). GIOP supports the Lee et al. [7] estimates 
of S bp and S dg , with the former included as part of 
GIOP-DC (Table 1). Both Smyth et al. [ 11 ] and 
Pinkerton et al. [ 57 ] adopt iterative approaches to 
deriving b bp (A 0 ) (step 2) and G(A) (step 4). Using a 
LUT for G(A ) that is keyed on environmental geom- 
etries and absorption and scattering coefficients, 
both SAAs iterate until either the selected G(A) or de- 
rived a(A ) stabilize. In contrast, Lee [7] adopts G(A) 
from Gordon et al. [ 21 ] with modified coefficients 
and dynamically estimates a(l 0 ) using an empirical 
relationship based on R rs (A), which is in turn used to 
derive b bp (A 0 ) via rearrangement of Eqs. (2) and (3). 
Note that SAAs in this class can be halted at step 4 



0.001 0.005 0.02 0.1 0.5 

Fig. 11. As in Figure 10, but for a dg ( 443). Units are m -1 . 


to enable testing or application of alternate ap- 
proaches to decompose a(A) into its component parts 
(e.g., [ 17 ] and [58]). At this time, GIOP does not sup- 
port the step-wise deconvolution approach typical 
of this SAA class. However, validation results for 
GIOP-DC and QAA compared favorably with MPD 
for QAA-derived 6 6p (443), a(443), a dg (AAZ), and 
a 0 (443) at 38%, 21%, 30%, and 28% for NOMAD 
and 16%, 14%, 19%, and 61% for the IOCCG data 
set (Figs. 10-12; see Table 2 for equivalent GIOP- 
DC statistics). Per their design, SAAs in this class 
always report A R rs = 0, unlike SAAs in the spectral 
optimization class. 

SAAs in the bulk inversion class do not assign ei- 
genvectors, that is, they do not predefine spectral 
shapes for the absorption or scattering coefficients. 
The approach introduced in Loisel and Stramski 
(LAS) [6] provides a widely used example. Briefly, 
LAS exploits a relationship between the diffuse at- 
tenuation coefficient for downwelling irradiance, 
K,i(A) (m -1 ), solar zenith angle, and the absorption 
and scattering coefficients (e.g., [59]). As such, it re- 
quires the remote estimation of K d (A) from R rs (A). 
LAS sequentially estimates b bp (Ai) and a (A,) at each 
wavelength A t using K d (Aj), environmental geom- 
etries, and LUTs derived from radiative transfer 
(Monte Carlo) simulations. By estimating IOPs at 
each wavelength independently, spectral shape func- 
tions, such as S bp , can be calculated dynamically 
and considered output products [60]. GIOP supports 
the use of LAS-derived S bp and (tabulated) b* bp (A) as 
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Fig. 12. As in Figure 10, but for a^(443). Units are m 1 . 
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assignable eigenvectors (Table 1). GIOP does not 
support, however, the spectrally independent inver- 
sion approach typical of this SAA class. 
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